Controlling Cardea: Fast Policy Search in a High Dimensional Space
Author
Abstract
The essential dynamics algorithm is a novel policy search algorithm for learning in a class of stochastic Markov decision processes (MDPs) with continuous state and action spaces. We apply it to the control of a 5-degree-of-freedom robot arm atop a Segway base. Movement of the arm causes the base to translate and tilt, which in turn affects the movement of the arm. The state space has 14 dimensions and the action space 5 dimensions, twice the dimensionality of typical policy search applications. Despite the highly non-linear dynamics, the algorithm is able to control the robot through a wide range. What's more, this is accomplished using very little domain knowledge and no knowledge of dynamics.
Similar resources
A Divisive Hierarchical Clustering-Based Method for Indexing Image Information
It is conventional to use multi-dimensional indexing structures to accelerate search operations in content-based image retrieval systems. Much effort has gone into developing multi-dimensional indexing structures so far. In most practical applications of image retrieval, high-dimensional feature vectors are required, but current multi-dimensional indexing structures lose their effici...
Using Parameterized Black-Box Priors to Scale Up Model-Based Policy Search for Robotics
The most data-efficient algorithms for reinforcement learning in robotics are model-based policy search algorithms, which alternate between learning a dynamical model of the robot and optimizing a policy to maximize the expected return given the model and its uncertainties. Among the few proposed approaches, the recently introduced BlackDROPS algorithm exploits a black-box optimization algorith...
Determining Optimal Support Vector Machines for Hyperspectral Image Classification Based on a Genetic Algorithm
Hyperspectral remote sensing imagery, with its rich spectral information, provides an efficient tool for ground classification in complex geographical areas with similar classes. Because Support Vector Machines (SVMs) are robust in high-dimensional spaces, they are an efficient tool for classifying hyperspectral imagery. However, there are two optimization issues which s...
Integrating value functions and policy search for continuous Markov Decision Processes
Value function approaches for Markov decision processes have been used successfully to find optimal policies for a large number of problems. Recent findings have demonstrated that policy search can be used effectively in reinforcement learning when standard value function techniques become overwhelmed by the size and dimensionality of the state space. We demonstrate that substantial benefits ca...
Packet classification using diagonal-based tuple space search
Multidimensional packet classification has attracted considerable research interest in the past few years due to the increasing demand for policy-based packet forwarding and security services. These network services typically involve determining the action to take on packets according to a set of rules. As the number of rules increases, time for determining the best matched rule for an incoming...